Highlighted role of VEGFA in follow up of celiac disease.

Aim
The aim of this study is to evaluate gene expression changes in the intestinal tissue of celiac patients in order to find a new molecular perspective on the disease.


Background
Celiac disease (CD) is an autoimmune disease characterized by an immune reaction to gluten. Genetic and environmental conditions are reported to be important in the onset and progress of CD.


Methods
Gene expression profiles of intestinal tissue from 12 celiac patients and 12 healthy controls were downloaded from the Gene Expression Omnibus (GEO) and verified by boxplot analysis. The significant, selected differentially expressed genes (DEGs) were included in a protein-protein interaction (PPI) network analysis. The central nodes were identified by Network Analyzer.


Results
The network was constructed from 161 query DEGs and 50 additional neighbors. GTF2H1, VEGFA, SUMO1, RAD51, MED21, BBP4, LEP, and MAP2K7 as potent hub nodes, and LRP5, RABGEF1, BCAS2, DYRK1B, AOC3, RABL2A, CRTAP, VEGFA, and SPOPL as potent bottlenecks, are introduced as crucial nodes.


Conclusion
Among the crucial DEGs, vascular endothelial growth factor A (VEGFA) was highlighted as an important biomarker candidate for the follow-up of celiac patients.


Introduction
Celiac disease (CD) is an autoimmune disease characterized by sensing of, and an immune reaction to, gluten (1,2). The role of environmental and genetic factors in the onset and promotion of CD is confirmed and discussed in detail (3). Nutritional deficiency in CD patients may lead to disorders such as iron deficiency anemia and osteoporosis, which are burdensome conditions for the patients (4). CD affects 0.5%-1% of the general population (5). Two diagnostic methods including initial via PPI network analysis to find efficient diagnostic therapeutic biomarkers (10). Several reports describe the analysis of disease staging via this approach (11,12). In the present study, gene expression changes of celiac patients relative to healthy controls are investigated by PPI network analysis to reveal more molecular aspects of this disease.

Methods
Gene expression profiles of celiac patients and controls, recorded in GEO under GSE112102/GPL10558, were extracted. The samples were obtained from the upper GI tract by endoscopic examination via multiple mucosal biopsies. Intestinal multiple mucosal biopsies of 7 males and 5 females aged 18-40 years, and of 10 males and 2 females aged 21-42 years, were selected as patients and controls, respectively. Gene expression distributions were investigated via boxplot analysis in GEO2R. The top 250 DEGs were identified for further analysis. DEGs with a fold change (FC) less than 1.5 or a P-value more than 0.05, as well as uncharacterized DEGs, were excluded. The remaining DEGs plus 50 neighbors were included in the PPI network using Cytoscape v 3.6.0 software (13). The main connected component was analyzed based on centrality parameters. The top 10 nodes based on degree (D), betweenness centrality (BC), closeness centrality (CC), and stress (S) were determined to find the central genes. The hub nodes that were also included in at least one other group of top nodes were identified as central nodes. The bottleneck nodes that were common with another group of top nodes were highlighted as crucial genes. Finally, the central and crucial nodes were evaluated to find action relationships between them.
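
The exclusion criteria described above can be illustrated with a minimal sketch. It assumes that each DEG is available as a record with a gene symbol, a log2 fold change, and a P-value; the field names (`symbol`, `logFC`, `p_value`), the function name, and the example records are hypothetical and are not taken from the study data. The fold-change cut-off of 1.5 is converted to the log2 scale (about 0.585), and values exactly at a cut-off are kept, which is an assumption about boundary handling.

```python
import math

# Cut-offs stated in the text: fold change of at least 1.5, P-value of at most 0.05.
MIN_FOLD_CHANGE = 1.5
MAX_P_VALUE = 0.05
# On the log2 scale a 1.5-fold change is about 0.585.
MIN_ABS_LOG2_FC = math.log2(MIN_FOLD_CHANGE)


def filter_degs(rows):
    """Return the rows that pass all three exclusion criteria.

    Each row is a dict with the hypothetical keys 'symbol', 'logFC'
    (log2 fold change) and 'p_value'.
    """
    kept = []
    for row in rows:
        symbol = (row.get("symbol") or "").strip()
        if not symbol:
            continue  # uncharacterized DEG: no gene symbol
        if abs(row["logFC"]) < MIN_ABS_LOG2_FC:
            continue  # fold change below 1.5 in either direction
        if row["p_value"] > MAX_P_VALUE:
            continue  # not statistically significant
        kept.append(row)
    return kept


# Hypothetical example records, for illustration only.
example_rows = [
    {"symbol": "GENE_A", "logFC": 1.2, "p_value": 0.01},
    {"symbol": "GENE_B", "logFC": 0.3, "p_value": 0.01},
    {"symbol": "", "logFC": -2.0, "p_value": 0.001},
    {"symbol": "GENE_C", "logFC": -0.9, "p_value": 0.2},
    {"symbol": "GENE_D", "logFC": -0.7, "p_value": 0.04},
]
kept_symbols = [row["symbol"] for row in filter_degs(example_rows)]
```

In this toy input only GENE_A and GENE_D remain: GENE_B fails the fold-change cut-off, the unnamed record is uncharacterized, and GENE_C fails the P-value cut-off.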

Results
Gene expression profiles of the 12 controls and 12 celiac patients were compared via boxplot analysis. As shown in figure 1, the profiles are comparable because the medians of the samples are aligned. However, the distributions of the patients' expression values are wider than those of the control samples.
The top 250 DEGs that differentiate patients from controls were extracted and screened based on logFC more than 0.6 or less than -0.6 and P-value less than 0.05. Among the 250 DEGs, 170 were characterized and imported into the STRING database via Cytoscape software. Of these, 161 DEGs were recognized by the STRING database and included in the PPI network. As shown in table 1, there were poor interactions between the nodes, and most of the nodes were isolated; therefore, 50 relevant genes were added and the network was constructed. Adding the 50 neighbor genes to the query genes improved the interactions between the nodes, but several nodes remained isolated. To improve connectivity further, 100 relevant genes were added to the query genes, but the result did not change significantly. Thus, 50 additional neighbors were used to construct the final PPI network. The network includes 4 connected components and 56 isolated nodes. The main connected component consists of 98 query DEGs and 50 additional genes, connected by 1325 edges (see figure 2). Hub nodes that are common with the top nodes based on CC are tabulated in table 2. As shown in table 3, there are 9 bottlenecks that are common with the top nodes based on stress. Action relationships between central and crucial nodes are shown in figure 3.
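
The selection of central and crucial nodes can be sketched as a set intersection over ranked centrality lists. The sketch below is one possible reading of the procedure, not the exact steps performed in Cytoscape. It assumes that hubs are the top nodes by degree, that bottlenecks are the top nodes by betweenness centrality, and that the centrality scores have already been exported as dictionaries keyed by gene name. The function names and the toy scores are hypothetical.

```python
def top_n(scores, n=10):
    """Return the set of the n highest-scoring nodes.

    Ties are broken by dictionary insertion order, because sorted() is stable.
    """
    ranked = sorted(scores, key=lambda node: scores[node], reverse=True)
    return set(ranked[:n])


def select_central_nodes(degree, betweenness, closeness, stress, n=10):
    """Select hubs and bottlenecks that also rank highly on another measure."""
    hubs = top_n(degree, n)              # assumed definition of hub nodes
    bottlenecks = top_n(betweenness, n)  # assumed definition of bottlenecks
    top_cc = top_n(closeness, n)
    top_stress = top_n(stress, n)

    # A hub is kept if it appears in at least one other top-n group.
    central_hubs = hubs & (bottlenecks | top_cc | top_stress)
    # A bottleneck is kept under the same rule.
    crucial_bottlenecks = bottlenecks & (hubs | top_cc | top_stress)
    # Nodes that are both a retained hub and a retained bottleneck.
    hub_bottlenecks = central_hubs & crucial_bottlenecks
    return central_hubs, crucial_bottlenecks, hub_bottlenecks


# Toy scores for four hypothetical nodes, with n=2 to keep the example small.
degree = {"A": 5, "B": 4, "C": 1, "D": 1}
betweenness = {"A": 0.9, "C": 0.8, "B": 0.1, "D": 0.0}
closeness = {"B": 0.7, "D": 0.6, "A": 0.5, "C": 0.1}
stress = {"C": 100, "D": 90, "A": 10, "B": 5}

central_hubs, crucial_bottlenecks, hub_bottlenecks = select_central_nodes(
    degree, betweenness, closeness, stress, n=2
)
```

In the toy example, node A is the only node retained as both a hub and a bottleneck, which mirrors the position reported for VEGFA in the network of this study.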

Discussion
Several studies of colon cancer (14), gastric cancer (15), and other types of gastrointestinal diseases are based on PPI network analysis. In this study, determination of early-stage biomarkers, identification of drug targets, and presentation of a new perspective on the molecular mechanism of CD are investigated. Since serum, cell lines, and tissue are all suitable sources for sample preparation, intestinal tissue of celiac patients was selected for analysis of the gene expression pattern. As shown in figure 1, the gene expression profiles of the patients and controls are median centered; thus, they are statistically comparable. It is expected that there are figure 2 adding the limited numbers of relevant genes to query genes was lead to scale free network. Central nodes such as hubs and bottlenecks are introduced as critical nodes of many disease networks. Tables 2 and 3 identify the hubs and bottlenecks that are also highlighted by at least one other centrality parameter. It must be mentioned that these 16 central nodes were selected from among the query genes and not from the added neighbor genes. VEGFA is a powerful central node: it is the only hub-bottleneck node, and it is also included in the top nodes based on closeness centrality and stress. Highlighting VEGFA as a critical DEG suggests that it is deeply involved in the regulation of the other central nodes. Figure 3 emphasizes the crucial role of VEGFA as a regulatory DEG: VEGFA binds to AOC3 and activates and upregulates LEP. Based on these findings, it seems that VEGFA plays a critical role in CD. The following discussion examines the involvement of VEGFA in CD. VEGF and its receptor are both important in angiogenesis under physiological and pathological conditions, and VEGF is reported to play a crucial role in cancer promotion. VEGFA, as one member of the VEGF family, is also involved in angiogenesis (16).
It is reported that the small bowel mucosa of CD patients generates immunologically active molecules which affect the liver of patients. In this regard, high levels of angiogenic factors such as VEGF cause development of vascular lesions in CD patients. There is evidence that mucosal VEGF is overexpressed in CD patients (17). A potent role of VEGF as an angiogenic, mitogenic, permeability-enhancing, and fibrosis-enhancing peptide in collagenous colitis was reported in 2004 by Yesuf Taha et al. In that study, increased perfusion of VEGF from the descending colon and rectum of patients was confirmed; however, the serum level of VEGF was unchanged (18). VEGF and EGFR were included in a PPI network that was constructed for celiac patients. The samples were human peripheral blood mononuclear cells. However, neither EGFR nor VEGF was among the query proteins; these two proteins were added to the query proteins as related neighbors (7). Since VEGF expression change is reported for several diseases, such as different types of cancer, it is not suitable as a specific biomarker for celiac disease, but it can be used for the follow-up of patients.